Abstract: Sparse representation has been applied successfully to visual tracking, where the best candidate is selected by representing it with a set of target templates. However, most sparse-representation-based trackers consider only a holistic representation of the target object and do not make full use of the sparse coefficients to discriminate between target and background. As a result, they are more likely to fail when a similar object or an occlusion appears in the scene. This paper studies the visual tracking problem in video sequences and presents a sparse tracker that uses coarse and fine dictionaries. The representation exploits both partial and structural information of the target through averaging and alignment pooling. The similarity obtained by pooling across the local patches helps not only to locate the target more accurately but also to handle partial occlusion. Object detection, identification, and tracking are the three main objectives of this paper. For object/person detection, a superpixel-based face detection algorithm is used, followed by moment-based matching and isosceles triangle matching. Object tracking is performed with an extended Kalman filter. In addition, the method employs a template update strategy that combines incremental subspace learning with local sparse representation. This strategy adapts the template to appearance changes of the target with less risk of drifting, and it reduces the influence of occluded target templates. The proposed algorithm improves tracking accuracy and is more robust to partial and full occlusion.
Keywords: Object tracking, sparse coding, averaging, alignment-pooling, occlusion detection.
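
The averaging and alignment-pooling step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the following assumes one common reading: each local patch of a candidate has a non-negative sparse code over the patches of all templates, the weights are summed across templates per patch position (averaging), and only the weight on the patch at the same position is kept (alignment). The function names `alignment_pool` and `similarity` and the flat, template-by-template layout of each code are assumptions made for illustration, not the paper's implementation.

```python
def alignment_pool(coeffs, n_templates, n_patches):
    """Return one pooled score per local patch of a candidate.

    coeffs[i] is the (assumed) sparse code of candidate patch i: a flat list
    of n_templates * n_patches non-negative weights, ordered template by
    template, so weight k * n_patches + j refers to patch j of template k.
    """
    pooled = []
    for i, code in enumerate(coeffs):
        if len(code) != n_templates * n_patches:
            raise ValueError("code length must be n_templates * n_patches")
        # Averaging: add the weights that fall on the same patch position
        # across all templates.
        v = [
            sum(code[k * n_patches + j] for k in range(n_templates))
            for j in range(n_patches)
        ]
        # Normalise so the pooled weights of this patch sum to one.
        total = sum(v)
        if total > 0:
            v = [x / total for x in v]
        # Alignment: keep only the weight on the template patch at the
        # same position i as the candidate patch.
        pooled.append(v[i])
    return pooled


def similarity(coeffs, n_templates, n_patches):
    """Candidate score: sum of the aligned, pooled weights over all patches."""
    return sum(alignment_pool(coeffs, n_templates, n_patches))


# Two templates, three patches. Every patch is explained by the template
# patches at its own position.
aligned = [
    [0.5, 0.0, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.0, 0.0, 0.5],
]
# Same candidate, but the third patch is explained by a misaligned template
# patch, as could happen when that region is occluded.
occluded = aligned[:2] + [[0.5, 0.0, 0.0, 0.5, 0.0, 0.0]]

print(similarity(aligned, 2, 3))
print(similarity(occluded, 2, 3))
```

Under these assumptions, a patch whose code lands on misaligned template patches contributes nothing to the score, while the remaining patches still contribute. This is one way to see how pooling over local patches can tolerate partial occlusion.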